01. What are Policy Gradient Methods?
Policy gradient methods are a subclass of policy-based methods. Watch the video below to learn more!
[Video: M3L3 C01 V3]
In the Introduction to Policy-Based Methods lesson, you learned about many policy-based methods that could approximate either a deterministic or stochastic policy.
In this lesson, we'll confine our attention to stochastic policies.
## Quiz
SOLUTION:
- Use a softmax activation function in the output layer. This will ensure the network outputs probabilities. For each state input, sample an action from the output probability distribution.
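The softmax-and-sample idea in this solution can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the course's implementation: the `SoftmaxPolicy` class, its single linear layer, the 4-dimensional state, and the 2 actions are all hypothetical choices made for the example.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


class SoftmaxPolicy:
    """Hypothetical one-layer stochastic policy: state -> action probabilities."""

    def __init__(self, state_size, action_size, seed=0):
        self.rng = np.random.default_rng(seed)
        # Small random weights (an assumption; any initialization would do here).
        self.W = 0.01 * self.rng.standard_normal((state_size, action_size))
        self.b = np.zeros(action_size)

    def action_probs(self, state):
        # The softmax output layer turns raw scores into a probability distribution.
        return softmax(state @ self.W + self.b)

    def act(self, state):
        # Sample an action from the output distribution instead of taking the argmax.
        probs = self.action_probs(state)
        action = int(self.rng.choice(len(probs), p=probs))
        return action, probs


# Illustrative sizes and state values, not taken from the lesson.
policy = SoftmaxPolicy(state_size=4, action_size=2)
state = np.array([0.1, -0.2, 0.05, 0.3])
action, probs = policy.act(state)
```

Because the action is sampled rather than chosen greedily, the same state can produce different actions on different calls, which is what makes the policy stochastic.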
SOLUTION:
- Policy gradient methods are a subclass of policy-based methods.
- Not all policy-based methods are policy gradient methods.
- Both policy-based methods and policy gradient methods try to optimize the policy directly, without maintaining value function estimates.